Learning Nash Equilibrium for General-Sum Markov Games from Batch Data
Authors
Abstract
This paper addresses the problem of learning a Nash equilibrium in γ-discounted multiplayer general-sum Markov Games (MGs) in a batch setting. As the number of players in an MG increases, the agents may either collaborate or oppose one another to increase their final rewards. One way to address this problem is to look for a Nash equilibrium. Although several techniques exist for the subcase of two-player zero-sum MGs, they fail to find a Nash equilibrium in general-sum Markov Games. In this paper, we introduce a new definition of ε-Nash equilibrium in MGs that captures the quality of a strategy profile in multiplayer games. We prove that minimizing the norm of two Bellman-like residuals implies learning such an ε-Nash equilibrium. We then show that minimizing an empirical estimate of the Lp norm of these Bellman-like residuals allows learning in general-sum games within the batch setting. Finally, we introduce a neural network architecture that successfully learns a Nash equilibrium in generic multiplayer general-sum turn-based MGs.
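The central quantity of the abstract is the empirical Lp norm of a Bellman-like residual computed over a batch of transitions. The sketch below is only a rough illustration of what such an estimate could look like for a turn-based MG; it is not the paper's actual formulation (which involves two residuals and neural-network function approximation), and the data layout (`batch`, `q`, `pi`, `player_to_move`) and the single-residual form are assumptions made for this example.

```python
import numpy as np

def empirical_lp_bellman_residual(batch, q, pi, player_to_move, gamma=0.9, p=2):
    """Empirical Lp norm of a one-step Bellman-like residual on a batch.

    batch          : iterable of (s, a, rewards, s_next), one reward per player
    q              : q[i][s][a]  -- player i's Q-value estimate
    pi             : pi[j][s][.] -- strategy of player j, who controls state s
    player_to_move : callable mapping a state to the controlling player's index
    """
    residuals = []
    for (s, a, rewards, s_next) in batch:
        j = player_to_move(s_next)  # whose turn it is in the next state
        for i, r_i in enumerate(rewards):
            # Bellman-like target for player i under the joint strategy pi
            target = r_i + gamma * np.dot(pi[j][s_next], q[i][s_next])
            residuals.append(abs(q[i][s][a] - target) ** p)
    return np.mean(residuals) ** (1.0 / p)
```

In a batch setting one would presumably minimize such a quantity with respect to the parameters of the Q-functions and strategies, e.g., by gradient descent on a parametric (neural-network) representation.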
Similar Papers
Learning in Markov Games with Incomplete Information
The Markov game (also called stochastic game (Filar & Vrieze 1997)) has been adopted as a theoretical framework for multiagent reinforcement learning (Littman 1994). In a Markov game, there are n agents, each facing a Markov decision process (MDP). All agents’ MDPs are correlated through their reward functions and the state transition function. As Markov decision process provides a theoretical ...
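To make the framework described in this snippet concrete, a minimal encoding of an n-player Markov game might look like the following; the container names and types are illustrative only and are not taken from any of the cited papers.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

JointAction = Tuple[int, ...]  # one action index per player

@dataclass
class MarkovGame:
    """n agents sharing one state space and one transition function,
    each with its own reward function over joint actions."""
    n_players: int
    n_states: int
    n_actions: int
    # transition[(s, joint_action)] -> probability distribution over next states
    transition: Dict[Tuple[int, JointAction], Dict[int, float]]
    # rewards[i][(s, joint_action)] -> reward received by player i
    rewards: List[Dict[Tuple[int, JointAction], float]]
```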
Cyclic Equilibria in Markov Games
Although variants of value iteration have been proposed for finding Nash or correlated equilibria in general-sum Markov games, these variants have not been shown to be effective in general. In this paper, we demonstrate by construction that existing variants of value iteration cannot find stationary equilibrium policies in arbitrary general-sum Markov games. Instead, we propose an alternative i...
Nonzero-sum Risk-sensitive Stochastic Games on a Countable State Space
The infinite horizon risk-sensitive discounted-cost and ergodic-cost nonzero-sum stochastic games for controlled Markov chains with countably many states are analyzed. For the discounted-cost game, we prove the existence of Nash equilibrium strategies in the class of Markov strategies under fairly general conditions. Under an additional geometric ergodicity condition and a small cost criterion,...
A Study of Gradient Descent Schemes for General-Sum Stochastic Games
Zero-sum stochastic games are easy to solve as they can be cast as simple Markov decision processes. This is however not the case with general-sum stochastic games. A fairly general optimization problem formulation is available for general-sum stochastic games by Filar and Vrieze [2004]. However, the optimization problem there has a non-linear objective and non-linear constraints with special s...
Multiagent reinforcement learning: algorithm converging to Nash equilibrium in general-sum discounted stochastic games
Reinforcement learning has turned out to be a technique that allowed robots to ride a bicycle, computers to play backgammon at the level of human world masters, and such complicated, high-dimensional tasks as elevator dispatching to be solved. Can it come to the rescue in the next generation of challenging problems like playing football or bidding on virtual markets? Reinforcement learning that provides a way o...